A two level strategy for audio segmentation

Authors

  • Sébastien Lefèvre
  • Nicole Vincent
Abstract

This paper deals with audio segmentation. The audio tracks are sampled into short sequences, which are classified into several classes. Each sequence can then be analysed further depending on the class it belongs to. We first describe simple techniques for segmentation into two or three classes. These methods rely on amplitude, spectral or cepstral analysis, and on classical hidden Markov models. Given the limitations of these approaches, we propose a two-level segmentation process. The segmentation is performed by computing several features for each audio sequence. These features are computed either on a complete audio segment or on a frame (a set of samples), which is a subset of the audio segment. The proposed approach for microsegmentation of audio data combines a K-means classifier at the segment level with a Multidimensional Hidden Markov Model system that uses the frame decomposition of the signal. A first classification is obtained using the K-means classifier and segment-based features. The final result then comes from the Multidimensional Hidden Markov Models and frame-based features involving the temporary results. Multidimensional Hidden Markov Models are an extension of classical hidden Markov models dedicated to multicomponent data. They are particularly well suited to our case, where each audio segment can be characterized by several features of different natures. We illustrate our methods in the context of the analysis of football audio tracks.
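
The first level described in the abstract (segment-based features fed to a K-means classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the two features (RMS amplitude and zero-crossing rate), the synthetic test signal, the farthest-point initialisation, and every function name below are assumptions chosen for illustration, since the abstract does not list the actual features. The second level (Multidimensional Hidden Markov Models on frame-based features) is not sketched here.

```python
import math
import random


def split_segments(signal, segment_len):
    """Cut a signal into consecutive, non-overlapping segments (any short tail is dropped)."""
    return [signal[i:i + segment_len]
            for i in range(0, len(signal) - segment_len + 1, segment_len)]


def segment_features(segment):
    """Assumed segment-level features: RMS amplitude and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in segment) / len(segment))
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0))
    return (rms, crossings / (len(segment) - 1))


def assign(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for every point."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign(points, centroids), centroids


def demo_labels(rate=8000, block=2000, segment_len=400):
    """Synthetic track: quiet tone, loud noise, quiet tone (invented test data)."""
    rng = random.Random(0)
    tone = [0.05 * math.sin(2 * math.pi * 100 * n / rate) for n in range(block)]
    noise = [rng.uniform(-1.0, 1.0) for _ in range(block)]
    signal = tone + noise + tone
    features = [segment_features(s) for s in split_segments(signal, segment_len)]
    labels, _ = kmeans(features, k=2)
    return labels


if __name__ == "__main__":
    print(demo_labels())
```

In a two-level scheme of the kind the abstract outlines, such segment labels would only be a temporary result to be refined by the frame-level model.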


Similar resources

The NIST 2004 Spring Rich Transcription Evaluation: Two-axis Merging Strategy in the Context of Multiple Distant Microphone Based Meeting Speaker Segmentation

This paper presents the ELISA speaker segmentation approach applied on multiple audio channel meeting recordings in the framework of NIST RT’04s meeting (spring) evaluation campaign. As done for BN data speaker segmentation, the ELISA “meeting” system involves two speaker segmentation systems developed individually by the CLIPS and LIA laboratories. The main originality consists in a “two-axis”...


Impact of audio segmentation and segment clustering on automated transcription accuracy of large spoken archives

This paper addresses the influence of audio segmentation and segment clustering on automatic transcription accuracy for large spoken archives. The work forms part of the ongoing MALACH project, which is developing advanced techniques for supporting access to the world’s largest digital archive of video oral histories collected in many languages from over 52000 survivors and witnesses of the Hol...


The effects of segmentation and redundancy methods on cognitive load and vocabulary learning and comprehension of English lessons in a multimedia learning environment

The present study was conducted to examine the effects of segmentation and redundancy methods on cognitive load and on vocabulary learning and comprehension of English lessons in a multimedia learning environment. In terms of purpose, this study is applied research with a true experimental design. The statistical population of the present study includes all people aged 14 to 16 who are enrolled in ...


Impact of Corporate Reputation on Brand Segmentation Strategy: An Empirical Study from Iranian Pharmaceutical Companies

The impact of corporate reputation uses, including value creation, corporate communication and strategic resources, on branding strategies such as segmentation and producing intangible assets for different industries is investigated in western countr...


A Two Level Classifier Process for Audio Segmentation

This paper deals with audio segmentation. We propose a two-level segmentation process that enables the audio tracks to be sampled into short sequences, which are classified into several classes. The segmentation is performed by computing several features for each audio sequence. These features are computed either on a complete audio segment or on a frame (a set of samples) which is a sub...



Journal:
  • Digital Signal Processing

Volume 21  Issue 

Pages  -

Publication date 2011